Phasic dopamine as a prediction error of intrinsic and extrinsic reinforcements driving both action acquisition and reward maximization: A simulated robotic study

Authors

Marco Mirolli

Vieri G. Santucci

Gianluca Baldassarre

Abstract

An important issue of recent neuroscientific research is to understand the functional role of the phasic release of dopamine in the striatum, and in particular its relation to reinforcement learning. The literature is split between two alternative hypotheses: one considers phasic dopamine as a reward prediction error similar to the computational TD-error, whose function is to guide an animal to maximize future rewards; the other holds that phasic dopamine is a sensory prediction error signal that lets the animal discover and acquire novel actions. In this paper we propose an original hypothesis that integrates these two contrasting positions: according to our view phasic dopamine represents a TD-like reinforcement prediction error learning signal determined by both unexpected changes in the environment (temporary, intrinsic reinforcements) and biological rewards (permanent, extrinsic reinforcements). Accordingly, dopamine plays the functional role of driving both the discovery and acquisition of novel actions and the maximization of future rewards. To validate our hypothesis we perform a series of experiments with a simulated robotic system that has to learn different skills in order to get rewards. We compare different versions of the system in which we vary the composition of the learning signal. The results show that only the system reinforced by both extrinsic and intrinsic reinforcements is able to reach high performance in sufficiently complex conditions.
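The hypothesis in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the function `td_error`, the class `EventPredictor`, the helper `run_trials`, and all parameter values are hypothetical and chosen only for illustration. The sketch assumes that the intrinsic reinforcement equals the surprise caused by an environmental change and fades as that change becomes predictable (temporary), while the extrinsic reward stays constant (permanent). Both are summed inside a standard TD-like error.

```python
def td_error(r_ext, r_int, v_now, v_next, gamma=0.9):
    """TD-like prediction error computed on the summed reinforcement."""
    return (r_ext + r_int) + gamma * v_next - v_now


class EventPredictor:
    """Hypothetical predictor of an environmental change.

    The intrinsic reinforcement is the unpredicted part of the event,
    so it shrinks as the predictor learns (a temporary reinforcement).
    """

    def __init__(self, lr=0.2):
        self.prediction = 0.0  # current estimate that the event will occur
        self.lr = lr

    def intrinsic_reinforcement(self, event_occurred):
        outcome = 1.0 if event_occurred else 0.0
        surprise = max(0.0, outcome - self.prediction)
        # Update the prediction towards the observed outcome.
        self.prediction += self.lr * (outcome - self.prediction)
        return surprise


def run_trials(n_trials, r_ext=0.0, lr_value=0.1):
    """Repeat one action that always causes the event, then ends the episode."""
    predictor = EventPredictor()
    value = 0.0  # learned value of the state in which the action is taken
    r_int_history, errors = [], []
    for _ in range(n_trials):
        r_int = predictor.intrinsic_reinforcement(True)
        # The next state is terminal, so its value is taken to be 0.
        delta = td_error(r_ext, r_int, value, 0.0)
        value += lr_value * delta
        r_int_history.append(r_int)
        errors.append(delta)
    return r_int_history, errors, value


if __name__ == "__main__":
    r_int, _, v_intrinsic_only = run_trials(200, r_ext=0.0)
    _, _, v_with_reward = run_trials(200, r_ext=1.0)
    print("first / last intrinsic reinforcement:", r_int[0], r_int[-1])
    print("final value without extrinsic reward:", v_intrinsic_only)
    print("final value with extrinsic reward:", v_with_reward)
```

Under these assumptions the intrinsic term dominates the error signal early on, which would support acquiring the action, and then vanishes, so that the learned value is eventually determined by the extrinsic reward alone.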


Related resources

Biological cumulative learning through intrinsic motivations : A simulated robotic study of the development of visually-guided reaching

This work aims to model the ability of biological organisms to achieve cumulative learning, i.e. to learn increasingly more complex skills on the basis of simpler ones. In particular, we studied how a simulated kinematic robotic system composed of an arm and an eye can learn the ability to reach for an object on the basis of the ability to systematically look at the object, which, in our set-up...


Biological cumulative learning through intrinsic motivations A simulated robotic study on the development of visually-guided reaching

This work aims to model the ability of biological organisms to achieve cumulative learning, i.e. learning increasingly more complex skills on the basis of simpler ones. In particular, we studied how a simulated kinematic robotic system composed of an arm and an eye can learn the ability to reach for an object on the basis of the ability to systematically look at the object, which, in our set-up...


Influence of Extrinsic and Intrinsic Rewards on Employee Engagement (Empirical Study in Public Sector of Uganda)

Considerable attention has been given to the identification of key forms of reward and its linkage to employee engagement. For this purpose, the following study aims to uncover the influence of extrinsic and intrinsic rewards on employee engagement in the public sector of Uganda. A sample of 184 public sector employees was randomly selected and taken from Gulu district. A quantita...


The phasic dopamine signal maturing: from reward via behavioural activation to formal economic utility.

The phasic dopamine reward prediction error response is a major brain signal underlying learning, approach and decision making. This dopamine response consists of two components that reflect, initially, stimulus detection from physical impact and, subsequently, reward valuation; dopamine activations by punishers reflect physical impact rather than aversiveness. The dopamine reward signal is di...


Statistics of midbrain dopamine neuron spike trains in the awake primate.

Work in behaving primates indicates that midbrain dopamine neurons encode a prediction error, the difference between an obtained reward and the reward expected. Studies of dopamine action potential timing in the alert and anesthetized rat indicate that dopamine neurons respond in tonic and phasic modes, a distinction that has been less well characterized in the primates. We used spike train mod...



Journal:

Neural networks : the official journal of the International Neural Network Society

Volume 39

Publication date: 2013
